---
title: "Popular Programming Languages"
output: 
  flexdashboard::flex_dashboard:
    theme:
      version: 4
      bootswatch: default
      navbar-bg: "purple"
    orientation: columns
    vertical_layout: fill
    source_code: embed
---

<style>
.chart-title {  /* chart title */
  font-size: 20px;
}
body {  /* normal text */
  font-size: 18px;
}
</style>

<base target="_blank">

```{r setup, include=FALSE}
library(flexdashboard)
library(tidyverse)
library(ggplot2)
library(tidyr)
library(DT)
```

Introduction
===

Column {data-width=550}
---
### Background
This study investigates trends in programming language popularity from 2004 to 2023. Programming languages play a crucial role in software development, and their popularity reflects industry trends, technological advancements, and developer preferences. By analyzing language popularity over two decades, the study aims to identify long-term patterns, track shifts in dominance, and compare established and emerging languages. Insights from this analysis can inform decisions about skill development, education, and industry planning.

### Research Questions
1. Which programming language has consistently maintained a high level of popularity throughout the years, and has there been any significant shift in dominance?

2. How have the popularity percentages of programming languages changed over the entire period from 2004 to 2023, and are there any noticeable long-term trends or patterns?

3. How do the popularity trajectories of established programming languages compare with those of emerging languages over the years?

Column {data-width=450}
---
### Variables
**The variables in the dataset**

Date

Abap

Ada

C/C++

C#

Cobol

Dart

Delphi/Pascal

Go

Groovy

Haskell

Java

JavaScript

Julia

Kotlin

Lua

Matlab

Objective-C

Perl

PHP

Powershell

Python

R

Ruby

Rust

Scala

Swift

TypeScript

VBA

Visual Basic


Data
===

Column {data-width=600}
---
```{r getData}
Languages <- read_csv("Popularity of Programming Languages from 2004 to 2023.csv")
```


### Data Set
```{r showTable}
datatable(Languages)
```



Column {data-width=400}
---
### Data Explanation

The dataset "Most Popular Programming Languages Since 2004" is derived from the PYPL (PopularitY of Programming Language) Index, which tracks the popularity of programming languages based on Google search trends. Each language's popularity is expressed as a percentage of total searches, indicating its share of interest. The dataset offers insights into trends and shifts in programming language popularity since 2004, making it a valuable resource for understanding the evolving landscape of programming languages.
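
As a rough illustration of what "percentage of total searches" means, the sketch below turns raw search counts into shares. The counts, the `search_counts` name, and the `LangA` to `LangC` placeholders are invented for illustration; the dataset contains only the resulting percentages, and the exact method behind the PYPL Index may differ.

```r
# Hypothetical monthly search counts (invented numbers, not PYPL data)
search_counts <- c(LangA = 300, LangB = 150, LangC = 50)

# A language's share is its count divided by the total, expressed as a percentage
share_pct <- 100 * search_counts / sum(search_counts)

print(round(share_pct, 1))
```

If the dataset's shares are complete, the language columns in each row should sum to roughly 100. That is an assumption worth checking on the real data rather than a documented guarantee.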


Popularity
===

Column {.tabset data-width=550}
---

### Popularity Overtime
```{r Line graph for popularity}
# Convert 'Date' to Date format and reshape the data to long format (one row per date and language)
Languages$Date <- as.Date(paste0(Languages$Date, "-01"), format = "%B %Y-%d")
Languages_long <- Languages %>%
  pivot_longer(cols = -Date, names_to = "Language", values_to = "Percentage")

# Creates a line graph for all the languages from 2004 to 2023
ggplot(Languages_long, aes(x = Date, y = Percentage, color = Language)) +
  geom_line() +
  scale_x_date(date_labels = "%Y", date_breaks = "5 years") + 
  labs(title = "Popularity of Programming Languages (2004-2023)",
       x = "Year",
       y = "Popularity Percentage",
       color = "Programming Language") +
  theme_minimal()
```

### Popularity Overtime (Bar)
```{r Popularity Mean bargraph}
# Calculate the mean popularity percentage for each language
language_means <- Languages %>%
  summarise(across(-Date, ~ mean(.x, na.rm = TRUE)))

# Reshape the means to long format (one row per language) and sort in ascending order
language_sorted <- language_means %>%
  pivot_longer(cols = everything(), names_to = "Language", values_to = "MeanPopularity") %>%
  arrange(MeanPopularity)

# Create a horizontal bar graph of each language's mean popularity
ggplot(language_sorted, aes(x = reorder(Language, MeanPopularity), y = MeanPopularity)) +
  geom_col(fill = "skyblue") +
  labs(title = "Mean Popularity Percentage of Programming Languages",
       x = "Programming Language",
       y = "Mean Popularity Percentage") +
  coord_flip() +
  theme_minimal()
```

Column {data-width=450}
---

### Analysis
Java and PHP have held large shares throughout the period, which likely reflects how widely both are used for building many kinds of software.
Python has grown the most in recent years, plausibly because it is easy to learn and serves many purposes, including web development, data analysis, and artificial intelligence.
Most other languages, such as C/C++ and JavaScript, have stayed at roughly the same level and remain important for particular tasks.
Python's rise suggests that languages that are easy to use and adapt to new needs tend to gain popularity, a trend that may continue.
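
One way to put numbers behind statements like these is to compare each language's mean share at the start and at the end of the period. The sketch below is a minimal base R example under stated assumptions: the `toy` table, its values, and the `LangA` to `LangC` names are invented, and the two-year windows are an arbitrary choice.

```r
# Toy data in the same wide shape as the dashboard's table
# (values are invented for illustration, not taken from the PYPL dataset)
toy <- data.frame(
  Date  = as.Date(c("2004-07-01", "2005-07-01", "2022-07-01", "2023-07-01")),
  LangA = c(30, 31, 17, 16),
  LangB = c(18, 19, 6, 5),
  LangC = c(3, 3, 27, 28)
)

year <- as.integer(format(toy$Date, "%Y"))
lang_cols <- setdiff(names(toy), "Date")

# Mean share in the first two and the last two observed years
early <- colMeans(toy[year <= min(year) + 1, lang_cols])
late  <- colMeans(toy[year >= max(year) - 1, lang_cols])

# Positive values mean the language gained share; negative values mean it lost share
change <- sort(late - early, decreasing = TRUE)
print(change)
```

Applied to the real table, the same comparison would show whether a language described as "consistent" has actually held its share or has drifted over time.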


Trends
===

Column {.tabset data-width=550}
---

### 5 Least Popular Languages
```{r Line graph for bottom 5 languages}
# Find the 5 least popular and 5 most popular languages
least_popular <- head(language_sorted$Language, 5)  
most_popular <- tail(language_sorted$Language, 5) 

# Create subsets for the least popular and most popular languages
least_popular_subset <- Languages %>%
  select(Date, all_of(least_popular))

most_popular_subset <- Languages %>%
  select(Date, all_of(most_popular))

# Create a line graph for the 5 least popular languages
ggplot(least_popular_subset, aes(x = Date)) +
  geom_line(aes(y = Julia, color = "Julia")) +
  geom_line(aes(y = Dart, color = "Dart")) +
  geom_line(aes(y = Haskell, color = "Haskell")) +
  geom_line(aes(y = Groovy, color = "Groovy")) +
  geom_line(aes(y = Ada, color = "Ada")) +
  theme_minimal() +
  labs(title = "Popularity Trends of Least Popular Programming Languages",
       x = "Date",
       y = "Popularity Percentage",
       color = "Language") +
  scale_color_manual(values = c(Julia = "blue", Dart = "red", Haskell = "green", 
                                Groovy = "orange", Ada = "purple"))
```

### 5 Most Popular Languages
```{r Line graph for top 5 languages}
# Create line graph for the 5 most popular languages
ggplot(most_popular_subset, aes(x = Date)) +
  geom_line(aes(y = JavaScript, color = "JavaScript")) +
  geom_line(aes(y = `C/C++`, color = "C/C++")) +
  geom_line(aes(y = PHP, color = "PHP")) +
  geom_line(aes(y = Python, color = "Python")) +
  geom_line(aes(y = Java, color = "Java")) +
  theme_minimal() +
  labs(title = "Popularity Trends of Most Popular Programming Languages",
       x = "Date",
       y = "Popularity Percentage",
       color = "Language") +
  scale_color_manual(values = c(JavaScript = "blue", "C/C++" = "red", PHP = "green", 
                                Python = "orange", Java = "purple"))
```


Column {data-width=450}
---

### Analysis
Over the period from 2004 to 2023, the top 5 programming languages (C/C++, Java, JavaScript, PHP, and Python) consistently held a significant portion of the overall popularity, with shares ranging from 10% to 30%. Over time, however, Java and PHP have slowly lost popularity, while JavaScript and C/C++ have been more stable. Python has seen the biggest increase in popularity and will most likely stay popular. Conversely, Ada, Dart, Groovy, Haskell, and Julia consistently remained among the least popular languages, struggling to reach even 1%, likely due to niche use cases and limited industry support.
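
Labels such as "losing popularity" or "stable" can be made more concrete by fitting a straight line to each language's share and reading off the slope. The following is a minimal sketch, not the method used for the charts above: the `toy` values and `LangA` to `LangC` names are invented, and the 0.5 points-per-year threshold is a hypothetical cut-off chosen only for illustration.

```r
# Toy yearly shares (invented values); LangA..LangC are placeholder names
toy <- data.frame(
  Year  = 2004:2008,
  LangA = c(30, 28, 26, 24, 22),   # steadily falling
  LangB = c(8, 8, 8, 8, 8),        # flat
  LangC = c(2, 5, 8, 11, 14)       # steadily rising
)

lang_cols <- setdiff(names(toy), "Year")

# Slope of a least-squares line through each language's shares,
# in percentage points per year
slopes <- sapply(lang_cols, function(col) {
  unname(coef(lm(toy[[col]] ~ toy$Year))[2])
})

# Label each language using a hypothetical threshold of 0.5 points per year
trend <- ifelse(slopes > 0.5, "rising",
         ifelse(slopes < -0.5, "declining", "stable"))

print(data.frame(slope = round(slopes, 2), trend = trend))
```

A linear slope is a coarse summary; a language that rises and then falls can show a slope near zero, so the line charts remain the better guide to shape.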


Old vs New
===

Column {.tabset data-width=550}
---

### Old Languages
```{r Old Language Line Graph}
# Group the languages by era: "old" (first released before 2000) and "new" (2000 or later)
OldLang <- c("Abap", "Ada", "C/C++", "Cobol", "Delphi/Pascal", "Haskell", "Java", "JavaScript",
           "Lua", "Matlab", "Objective-C", "Perl", "PHP", "Python", "R", "Ruby", "VBA", "Visual Basic")

NewLang <- c("C#", "Dart", "Go", "Groovy", "Julia", "Kotlin", "Powershell", "Rust", "Scala",
           "Swift", "TypeScript")

# Select each group's columns from the overall means
OldLangMeans <- language_means %>%
  select(all_of(OldLang))

NewLangMeans <- language_means %>%
  select(all_of(NewLang))

# Reshape the old-language means to long format (one row per language) and sort in ascending order
OldLangMeansLong <- OldLangMeans %>%
  pivot_longer(cols = everything(), names_to = "Language", values_to = "MeanPopularity") %>%
  arrange(MeanPopularity)

# Create line graph for old languages
ggplot(Languages_long[Languages_long$Language %in% OldLang, ], aes(x = Date, y = Percentage, color = Language)) +
  geom_line() +
  theme_minimal() +
  labs(title = "Programming Languages Released Before 2000",
       x = "Year",
       y = "Popularity Percentage",
       color = "Programming Language")
```

### Old Languages (Bar)
```{r Old Language Bar Graph}
# Create a horizontal bar graph of the mean popularity of the old languages
ggplot(OldLangMeansLong, aes(x = reorder(Language, MeanPopularity), y = MeanPopularity)) +
  geom_col(fill = "skyblue") +
  labs(title = "Mean Popularity of Old Programming Languages",
       x = "Programming Language",
       y = "Mean Popularity") +
  coord_flip() +
  theme_minimal()
```

### New Languages
```{r New Language Line Graph}
# Reshape the new-language means to long format (one row per language) and sort in ascending order
NewLangMeansLong <- NewLangMeans %>%
  pivot_longer(cols = everything(), names_to = "Language", values_to = "MeanPopularity") %>%
  arrange(MeanPopularity)

# Create line graph for new languages
ggplot(Languages_long[Languages_long$Language %in% NewLang, ], aes(x = Date, y = Percentage, color = Language)) +
  geom_line() +
  theme_minimal() +
  labs(title = "Programming Languages Released Since 2000",
       x = "Year",
       y = "Popularity Percentage",
       color = "Programming Language")
```

### New Languages (Bar)
```{r New Language Bar Graph}
# Create a horizontal bar graph of the mean popularity of the new languages
ggplot(NewLangMeansLong, aes(x = reorder(Language, MeanPopularity), y = MeanPopularity)) +
  geom_col(fill = "skyblue") +
  labs(title = "Mean Popularity of New Programming Languages",
       x = "Programming Language",
       y = "Mean Popularity") +
  coord_flip() +
  theme_minimal()
```

Column {data-width=450}
---

### Analysis
Older languages such as Java, Python, PHP, C/C++, JavaScript, and Visual Basic have been popular for a long time, with shares ranging from 5% to 25%. They are widely used and well known in the programming world.
Despite their age, these languages have kept up with modern trends. For example, JavaScript started as a simple language for websites but is now used for many other things too.
Newer languages such as Go, Kotlin, Rust, and Swift are slowly becoming more popular, but they usually have smaller shares, often less than 5%. They are gaining interest, especially in specific areas.
C# is an exception among the newer languages, with a higher share of around 10%. It is widely used, especially for building software on Microsoft platforms. Overall, the older languages hold higher shares than the newer ones, consistent with how long they have been in widespread use.
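
The old-versus-new comparison can also be summarised as a single number per group. The sketch below is a base R illustration under stated assumptions: the `mean_share` values and the `OldA` to `NewC` names are invented, and the grouping only mirrors the idea of the `OldLang` and `NewLang` split used for the charts.

```r
# Toy mean shares per language (invented values, placeholder names)
mean_share <- c(OldA = 20, OldB = 10, OldC = 6, NewA = 9, NewB = 2, NewC = 1)

# Hypothetical grouping by release era
era <- ifelse(names(mean_share) %in% c("OldA", "OldB", "OldC"), "old", "new")

# Average and median share within each group
group_mean   <- tapply(mean_share, era, mean)
group_median <- tapply(mean_share, era, median)

print(rbind(mean = group_mean, median = group_median))
```

Reporting the median alongside the mean helps when one language in a group, such as a single unusually popular newer language, would otherwise pull the group average up.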

Conclusion
===

Column {data-width=450}
---
### Limitations
The analysis relies on available data on programming language popularity, which may be subject to limitations such as data accuracy, coverage, and methodology used for measurement. Different sources may provide varying results, affecting the comprehensiveness and reliability of the analysis.

The analysis may not account for all external factors influencing programming language popularity, such as changes in technology, industry trends, developer preferences, and global events. These factors could introduce biases or limitations in the interpretation of trends and patterns.

Additionally, the dataset used for this analysis covers only the period from 2004 to 2023, which may limit what can be observed about older languages and about long-term trends in programming language popularity.

### References
Muhammad Khalid, "Most Popular Programming Languages Since 2004", Kaggle: <https://www.kaggle.com/datasets/muhammadkhalid/most-popular-programming-languages-since-2004>

Column {data-width=550}
---
### Conclusion
Consistency of Popular Languages: The analysis shows that Java and PHP have maintained high levels of popularity over the years, while Python has risen to prominence in recent years, marking a significant shift in dominance. These languages have demonstrated resilience and adaptability to changing industry demands and technological advancements.

Long-term Trends and Patterns: The popularity percentages from 2004 to 2023 show several long-term patterns. Established languages like C/C++, Java, JavaScript, Perl, and PHP have remained relatively stable, while emerging languages have shown growth potential but generally hold smaller shares. Notable exceptions are C# and Python, which have experienced significant increases in popularity over time.

Comparison of Established and Emerging Languages: Established languages have a strong foothold in the programming landscape, with higher shares and widespread adoption, while emerging languages demonstrate potential for growth and innovation. The trajectory of established languages reflects their maturity and versatility, while emerging languages offer fresh perspectives and solutions to contemporary challenges in software development.

About Me
===
My name is Phillip Phuong and I am a junior at the University of Dayton.

I am pursuing a Bachelor of Arts in Computer Information Systems with minors in Data Analytics and Philosophy.

My projected graduation is May 2026.

### Picture

```{r , fig.width=6, echo=FALSE, fig.cap="Phillip Phuong", fig.align='right'}
knitr::include_graphics("BodyImage.jpg")
```